BACKGROUND OF THE INVENTION

Field of the Invention 

The present invention relates to image processing, and, in particular, to video compression. 

Cross-Reference to Related Applications

This application claims the benefit of the filing date of U.S. provisional application no. 
60/100,939, filed on 09/18/98 as attorney docket no. SAR 12728PROV. 



Description of the Related Art 

Video compression is employed to reduce the bandwidth required for transmission or storage.
Many standards have evolved for video compression, such as H.261, H.263, H.263+, and the MPEG-1,
2, and 4 standards. These standards use motion compensation and predictive coding where some
frames are predicted from reference frames in order to achieve coding efficiency. They also use
variable-length codes (VLCs) for the same purpose. While these techniques are excellent from the
point of view of compression, in the presence of channel errors, they can cause propagation of errors
over a large part of the sequence.

Many tools have been developed to improve the error resilience of compressed video
bitstreams, such as resynchronization (resync) markers, data partitioning, and reversible VLCs, which
are now part of the MPEG-4 standard.

When channel errors cause the decoder to lose synchronization of a compressed video
bitstream that was encoded using VLCs, all the following data up to the next resync point in the
bitstream will be lost. In the normal encoding mode, this resync point will be the start of the next
picture. The use of resync markers splits each picture into video packets by explicitly introducing
markers in the bitstream and ensuring that there are no dependencies across the packets. Thus, an
error in a packet is confined within that packet.

Data partitioning splits the data according to importance. For example, in motion-compensated
predictive coding, the motion is usually more important to the perceived quality than the residual
(i.e., the inter-frame differences after motion compensation). If the motion
data are placed earlier in the data packet than the residual data, then a channel error that occurs during
transmission of the residual data will not affect the more-important motion data. This further increases
the resilience of the bitstream in the presence of errors. Reversible VLCs provide additional
localization of bit errors.

In all the coding standards, there are intra-coded pictures (I frames) and predictive-coded
pictures (P and B frames). In P and B frames, individual macroblocks can be coded in the intra mode,
i.e., without dependence on previously decoded information. Intra-coded pictures and macroblocks are
excellent from the point of view of error resilience since they avoid the propagation of errors.
However, their compression efficiency is very low. Also, in low-delay applications such as
video-phone and video-conferencing, intra frames may result in a large frame skip, following which the
motion-compensated prediction will not be very effective. Also, the mean frame rate would drop and
the motion would become very jerky.

SUMMARY OF THE INVENTION 
The present invention is directed to a scheme that adaptively refreshes different regions in the
video (according to their relative importance) using intra-coded macroblocks to obtain good
compression performance as well as resilience in the presence of errors. According to one
embodiment, the present invention is a method for compressing frames of a video sequence,
comprising the steps of (a) dividing each frame into two or more regions; (b) selecting one or more
macroblocks in each region to be intra-coded; and (c) encoding each frame, wherein the selected
macroblocks are intra-coded.

BRIEF DESCRIPTION OF THE DRAWINGS 
Other aspects, features, and advantages of the present invention will become more fully
apparent from the following detailed description, the appended claims, and the accompanying drawings
in which:

Fig. 1 shows a frame of a video sequence, where the frame has been divided into three regions:
a "most-important" region, a "less-important" region, and a "least-important" region;

Fig. 2 shows a flow diagram of the processing corresponding to a refresh strategy for the video
sequence of the frame of Fig. 1, according to one embodiment of the present invention; and

Fig. 3 shows a flow diagram of the processing applied to a frame in an encoded video bitstream
that was generated using the encoding processing of Fig. 2, when an error is detected at the decoder,
according to one embodiment of the present invention.

DETAILED DESCRIPTION 
Fig. 1 shows a frame 100 of a video sequence, where frame 100 has been divided into three
regions: a "most-important" region 102, a "less-important" region 104, and a "least-important" region
106. Fig. 1 corresponds to a typical video-conferencing scenario in which a talking person is located at
the center of the picture (i.e., foreground region 102), where foreground region 102 is more important
to the viewer of the decoded video stream than the background region 106. Region 104 corresponds to
a transition region between the foreground region and the background region. Depending on the
specific application, the regions can be selected differently, and different numbers of regions can be
selected. For example, a picture with two persons may have two most-important foreground regions,
two less-important transition regions, and a single least-important background region.

The present invention is directed to a strategy for refreshing different regions in a video 
sequence using intra-coded macroblocks that takes into account the relative importance of the different 
regions. The refresh strategy of the present invention may be implemented for the video sequence of
frame 100 as follows: 

1. For transition region 104 and background region 106, the user selects two numbers N1 and N2
corresponding to the numbers of macroblocks, respectively, that will be intra-coded in these regions for
every coded frame. The exact macroblocks to be coded for a particular frame are chosen at random.
Eventually, over a long period, all the macroblocks in these regions will be refreshed.

2. In the most-important foreground region 102, the user selects N_SLICE, the number of
macroblocks in this region to be intra-coded per frame. The value of N_SLICE depends on how much
resilience is to be added to the bitstream at the expense of compression efficiency. The region is then
divided into N_SLICE slices and one macroblock is intra-coded in each of these slices per
coded frame. By keeping track of the last time value at which each macroblock was intra-coded, the
macroblock in each slice that was least recently intra-coded is selected for intra-coding in the current
frame. Within a slice, ties are resolved by selecting based on some specified rule, e.g., the macroblock
furthest to the right in the slice. Optionally, each intra-macroblock can be sent in its own video packet
to give additional protection to the bitstream.
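
By way of illustration only, the two selection rules above might be sketched in Python as follows. The sketch makes several assumptions that the description does not specify: macroblocks are identified by hypothetical (row, column) tuples, slices are formed as contiguous runs of the foreground macroblocks in raster order, a macroblock that has never been intra-coded is treated as the least recently coded, and the tie-breaking rule is the right-most macroblock in the slice. All function names are illustrative.

```python
import random


def select_random(region, count, rng):
    """Rule 1: choose `count` macroblocks of a region at random."""
    return rng.sample(sorted(region), min(count, len(region)))


def split_into_slices(region, n_slice):
    """Divide a region into n_slice contiguous slices in raster order (an assumption)."""
    mbs = sorted(region)
    if not mbs:
        return []
    n_slice = min(n_slice, len(mbs))
    base, extra = divmod(len(mbs), n_slice)
    slices, start = [], 0
    for i in range(n_slice):
        size = base + (1 if i < extra else 0)
        slices.append(mbs[start:start + size])
        start += size
    return slices


def select_least_recent(slice_mbs, last_intra):
    """Rule 2: least recently intra-coded macroblock; ties go to the right-most column."""
    # -1 stands for "never intra-coded"; negating the column makes min() prefer the right.
    return min(slice_mbs, key=lambda mb: (last_intra.get(mb, -1), -mb[1]))


def select_intra_macroblocks(foreground, transition, background,
                             n1, n2, n_slice, last_intra, frame_no, rng):
    """Return the macroblocks to intra-code in this frame and record when they were coded."""
    chosen = select_random(transition, n1, rng)
    chosen += select_random(background, n2, rng)
    for slice_mbs in split_into_slices(foreground, n_slice):
        chosen.append(select_least_recent(slice_mbs, last_intra))
    for mb in chosen:
        last_intra[mb] = frame_no
    return chosen
```

Under these assumptions, a foreground slice of four macroblocks is refreshed completely every four coded frames, right-most macroblock first, while the transition and background regions are refreshed only probabilistically.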

Fig. 2 shows a flow diagram of the encoding processing applied to each coded frame in the
video sequence containing frame 100, according to one embodiment of the present invention. The
processing begins by dividing the current frame into most-important region 102, less-important region
104, and least-important region 106 (step 202). The analysis of step 202 is referred to as segmentation
analysis, which, for purposes of the present invention, can be implemented using any suitable scheme,
including automatic schemes or interactive schemes in which the regions of interest are explicitly
identified by the user (e.g., a participant in a video-conference located either at the encoder or the
decoder). In either case, the segmentation analysis can be performed adaptively throughout the video
sequence. As such, the specific macroblocks that constitute the various regions can vary from frame to
frame (e.g., as the talking person moves within the field of view).

After the macroblocks of the three regions are identified in step 202, N1 macroblocks are
randomly selected in the less-important region 104 for intra-coding (step 204) and N2 macroblocks are
randomly selected in the least-important region 106 for intra-coding (step 206). The most-important
region 102 is divided into N_SLICE slices (step 208) and the least-recently intra-coded macroblock in
each slice is selected for intra-coding (step 210). Note that, if two or more macroblocks in a given slice
were equally least-recently intra-coded, then the right-most of those macroblocks in the slice is selected
for intra-coding.
After macroblocks have been selected for intra-coding in all of the regions, the frame is
encoded, applying an appropriate intra-coding technique to the selected macroblocks (step 212). 
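
The refresh behavior implied by the two selection rules can be quantified under simple assumptions that are not stated in the description: a region contains a fixed number M of macroblocks, the N macroblocks intra-coded per frame are drawn uniformly at random without replacement within a frame and independently from frame to frame, and a foreground slice contains a fixed number M_s of macroblocks. The symbols M, N, M_s, and k are introduced here for illustration only.

```latex
% Random refresh (regions 104 and 106): N of M macroblocks per coded frame.
\[
  P(\text{a given macroblock is not refreshed in } k \text{ coded frames})
    = \left(1 - \frac{N}{M}\right)^{k},
  \qquad
  E[\text{coded frames until refresh}] = \frac{M}{N}.
\]
% Least-recently-coded rule (region 102): one of M_s macroblocks per slice per coded frame.
\[
  \text{interval between successive refreshes of a macroblock} \le M_s \text{ coded frames}.
\]
```

For example, with M = 60 and N = 3, a macroblock waits 20 coded frames on average, and the probability that it is still unrefreshed after 60 coded frames is (0.95)^60, or roughly 0.05. The slice-based rule, by contrast, gives a deterministic bound, which is consistent with its use in the most-important region.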

Fig. 3 shows a flow diagram of the processing applied to a frame in an encoded video bitstream
that was generated using the encoding processing of Fig. 2, when a transmission or other bitstream
error is detected at the decoder, according to one embodiment of the present invention. In particular,
when the error is detected, the decoder discards all the data in the packet (or, in that partition, if data
partitioning is used) (step 302). The decoder then uses a concealment strategy to fill in the missing
macroblocks. If the motion vectors were decoded correctly (step 304), the decoder uses the
motion-compensated macroblock for concealment (step 306). Otherwise, the decoder treats the missing
macroblocks as skipped macroblocks and fills in the corresponding regions with the corresponding
macroblocks in the reference frame (step 308). Although this concealment strategy could lead to the
propagation of decoding errors over time, the refresh strategy of the present invention reduces this
propagation of decoding errors and ensures good video quality even in the presence of transmission
errors.
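
The concealment of steps 302 through 308 might be sketched in Python as follows, purely as an illustration. The sketch assumes 16x16 macroblocks, frames stored as two-dimensional lists of luma samples, whole-pixel motion vectors given as hypothetical (dy, dx) pairs, and clamping of the displaced block to the frame boundary. None of these details is specified in the description, and the function names are illustrative.

```python
MB = 16  # macroblock size in pixels (an assumption)


def conceal_macroblock(reference, row, col, motion_vector=None):
    """Build a replacement for a lost macroblock from the reference frame.

    reference: 2-D list of samples (rows of pixels).
    motion_vector: (dy, dx) in whole pixels if it was decoded correctly, else None.
    """
    height, width = len(reference), len(reference[0])
    y, x = row * MB, col * MB
    if motion_vector is not None:
        # Step 306: motion-compensated concealment.
        dy, dx = motion_vector
        # Clamp so the displaced block stays inside the frame (an assumption).
        y = min(max(y + dy, 0), height - MB)
        x = min(max(x + dx, 0), width - MB)
    # Step 308 when no motion vector survived: co-located block, as for a skipped macroblock.
    return [line[x:x + MB] for line in reference[y:y + MB]]


def conceal_packet(current, reference, lost_macroblocks, motion_vectors):
    """After the packet data are discarded (step 302), overwrite each lost macroblock."""
    for (row, col) in lost_macroblocks:
        block = conceal_macroblock(reference, row, col,
                                   motion_vectors.get((row, col)))
        for i, line in enumerate(block):
            current[row * MB + i][col * MB:(col + 1) * MB] = line
    return current
```

Here `motion_vectors` holds entries only for macroblocks whose motion data were decoded correctly, for instance because data partitioning placed them ahead of the corrupted residual data.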

Of course, the present invention can be implemented in a wide variety of alternative
embodiments. In general, the present invention is directed to encoding and decoding schemes in which
each coded frame in a video sequence is divided into two or more regions, where numbers of
macroblocks to be intra-coded in each frame are specified for each region. Exactly how the
macroblocks are selected for each different region in each frame can vary from one implementation to
another.

The present invention can be embodied in the form of methods and apparatuses for practicing
those methods. The present invention can also be embodied in the form of program code embodied in
tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable
storage medium, wherein, when the program code is loaded into and executed by a machine, such as a
computer, the machine becomes an apparatus for practicing the invention. The present invention can
also be embodied in the form of program code, for example, whether stored in a storage medium,
loaded into and/or executed by a machine, or transmitted over some transmission medium, such as over
electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the
program code is loaded into and executed by a machine, such as a computer, the machine becomes an
apparatus for practicing the invention. When implemented on a general-purpose processor, the
program code segments combine with the processor to provide a unique device that operates
analogously to specific logic circuits.

It will be further understood that various changes in the details, materials, and arrangements of
the parts which have been described and illustrated in order to explain the nature of this invention may
be made by those skilled in the art without departing from the principle and scope of the invention as
expressed in the following claims.
CLAIMS

What is claimed is: 

1. A method for compressing frames of a video sequence, comprising the steps of:
(a) dividing each frame into two or more regions;
(b) selecting one or more macroblocks in each region to be intra-coded; and
(c) encoding each frame, wherein the selected macroblocks are intra-coded.

2. The invention of claim 1, wherein, for at least one of the regions, the one or more macroblocks
to be intra-coded are selected randomly in the region.

3. The invention of claim 1, wherein, for at least one other of the regions, the region is divided
into two or more slices, and a least-recently intra-coded macroblock in each slice is selected for
intra-coding for each frame.

4. The invention of claim 3, wherein a specified selection rule is applied if there are two or more
least-recently intra-coded macroblocks in a slice.

5. The invention of claim 1, wherein:
step (a) comprises the step of dividing each frame into a most-important region, a less-important
region, and a least-important region; and
step (b) comprises the steps of:
(1) randomly selecting a first specified number of macroblocks in the least-important region to
be intra-coded;
(2) randomly selecting a second specified number of macroblocks in the less-important region
to be intra-coded; and
(3) dividing the most-important region into a third specified number of slices and selecting a
least-recently intra-coded macroblock in each slice for intra-coding.

6. The invention of claim 5, wherein a specified selection rule is applied if there are two or more
least-recently intra-coded macroblocks in a slice.

7. A computer-readable medium having stored thereon a plurality of instructions, the plurality of
instructions including instructions which, when executed by a processor, cause the processor to
implement a method for compressing frames of a video sequence, the method comprising the steps of:
(a) dividing each frame into two or more regions;
(b) selecting one or more macroblocks in each region to be intra-coded; and
(c) encoding each frame, wherein the selected macroblocks are intra-coded.

8. A method for decoding a compressed video bitstream, comprising the steps of:
(a) receiving the compressed video bitstream, wherein the compressed video bitstream was
encoded by:
(1) dividing each frame into two or more regions;
(2) selecting one or more macroblocks in each region to be intra-coded; and
(3) encoding each frame, wherein the selected macroblocks are intra-coded; and
(b) decoding the compressed video bitstream, wherein, if an error is detected in a data packet of
the compressed video bitstream, then data in the packet are discarded and a concealment strategy is
implemented for macroblocks corresponding to the discarded data.

9. The invention of claim 8, wherein the concealment strategy comprises:
(1) using motion-compensated data for the corresponding macroblocks, if motion vectors are
accurately decoded; and
(2) using non-motion-compensated reference data for the corresponding macroblocks, if the
motion vectors are not accurately decoded.

10. A computer-readable medium having stored thereon a plurality of instructions, the plurality of
instructions including instructions which, when executed by a processor, cause the processor to
implement a method for decoding a compressed video bitstream, the method comprising the steps of:
(a) receiving the compressed video bitstream, wherein the compressed video bitstream was
encoded by:
(1) dividing each frame into two or more regions;
(2) selecting one or more macroblocks in each region to be intra-coded; and
(3) encoding each frame, wherein the selected macroblocks are intra-coded; and
(b) decoding the compressed video bitstream, wherein, if an error is detected in a data packet of
the compressed video bitstream, then data in the packet are discarded and a concealment strategy is
implemented for macroblocks corresponding to the discarded data.